Labeled Grammar Induction with Minimal Supervision

نویسندگان

  • Yonatan Bisk
  • Christos Christodoulopoulos
  • Julia Hockenmaier
چکیده

Nearly all work in unsupervised grammar induction aims to induce unlabeled dependency trees from gold part-of-speechtagged text. These clean linguistic classes provide a very important, though unrealistic, inductive bias. Conversely, induced clusters are very noisy. We show here, for the first time, that very limited human supervision (three frequent words per cluster) may be required to induce labeled dependencies from automatically induced word clusters.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Probing the Linguistic Strengths and Limitations of Unsupervised Grammar Induction

Work in grammar induction should help shed light on the amount of syntactic structure that is discoverable from raw word or tag sequences. But since most current grammar induction algorithms produce unlabeled dependencies, it is difficult to analyze what types of constructions these algorithms can or cannot capture, and, therefore, to identify where additional supervision may be necessary. This...

متن کامل

Learning Data Driven Representations from Large Collections of Multidimensional Patterns with Minimal Supervision

Learning Data Driven Representations from Large Collections of Multidimensional Patterns with Minimal Supervision by Parvez Ahammad Doctor of Philosophy in Engineering—Electrical Engineering and Computer Sciences University of California, Berkeley Professor S. Shankar Sastry, Chair Traditionally, taking experimental measurements of a physical or biological phenomenon was an expensive, laborious...

متن کامل

یک مدل بیزی برای استخراج باناظر گرامر زبان طبیعی

In this paper, we show that the problem of grammar induction could be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems, consistent with human cognition. We also show that using th...

متن کامل

Evaluating Induced CCG Parsers on Grounded Semantic Parsing

We compare the effectiveness of four different syntactic CCG parsers for a semantic slotfilling task to explore how much syntactic supervision is required for downstream semantic analysis. This extrinsic, task-based evaluation also provides a unique window into the semantics captured (or missed) by unsupervised grammar induction systems.

متن کامل

Supervised Grammar Induction using Training Data with Limited Constituent Information

Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is prohibitively expensive. This work aims to improve the induction strategy when there are few labels in the training data. We show that the most informative linguistic constituents are the higher nodes in the parse tre...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015